Goto

Collaborating Authors

 gap function








Quantification of Sim2Real Gap via Neural Simulation Gap Function

Sangeerth, P, Jagtap, Pushpak

arXiv.org Artificial Intelligence

In this paper, we introduce the notion of neural simulation gap functions, which formally quantifies the gap between the mathematical model and the model in the high-fidelity simulator, which closely resembles reality. Many times, a controller designed for a mathematical model does not work in reality because of the unmodelled gap between the two systems. With the help of this simulation gap function, one can use existing model-based tools to design controllers for the mathematical system and formally guarantee a decent transition from the simulation to the real world. Although in this work, we have quantified this gap using a neural network, which is trained using a finite number of data points, we give formal guarantees on the simulation gap function for the entire state space including the unseen data points. We collect data from high-fidelity simulators leveraging recent advancements in Real-to-Sim transfer to ensure close alignment with reality. We demonstrate our results through two case studies - a Mecanum bot and a Pendulum.


Learning Variational Inequalities from Data: Fast Generalization Rates under Strong Monotonicity

Zhao, Eric, Chavdarova, Tatjana, Jordan, Michael

arXiv.org Machine Learning

Variational inequalities (VIs) are a broad class of optimization problems encompassing machine learning problems ranging from standard convex minimization to more complex scenarios like min-max optimization and computing the equilibria of multi-player games. In convex optimization, strong convexity allows for fast statistical learning rates requiring only $\Theta(1/\epsilon)$ stochastic first-order oracle calls to find an $\epsilon$-optimal solution, rather than the standard $\Theta(1/\epsilon^2)$ calls. In this paper, we explain how one can similarly obtain fast $\Theta(1/\epsilon)$ rates for learning VIs that satisfy strong monotonicity, a generalization of strong convexity. Specifically, we demonstrate that standard stability-based generalization arguments for convex minimization extend directly to VIs when the domain admits a small covering, or when the operator is integrable and suboptimality is measured by potential functions; such as when finding equilibria in multi-player games.


Direct Gradient Temporal Difference Learning

Qian, Xiaochi, Zhang, Shangtong

arXiv.org Artificial Intelligence

Off-policy learning enables a reinforcement learning (RL) agent to reason counterfactually about policies that are not executed and is one of the most important ideas in RL. It, however, can lead to instability when combined with function approximation and bootstrapping, two arguably indispensable ingredients for large-scale reinforcement learning. This is the notorious deadly triad. Gradient Temporal Difference (GTD) is one powerful tool to solve the deadly triad. Its success results from solving a doubling sampling issue indirectly with weight duplication or Fenchel duality. In this paper, we instead propose a direct method to solve the double sampling issue by simply using two samples in a Markovian data stream with an increasing gap. The resulting algorithm is as computationally efficient as GTD but gets rid of GTD's extra weights. The only price we pay is a logarithmically increasing memory as time progresses. We provide both asymptotic and finite sample analysis, where the convergence rate is on-par with the canonical on-policy temporal difference learning. Key to our analysis is a novel refined discretization of limiting ODEs.


Accelerated Primal-Dual Methods for Convex-Strongly-Concave Saddle Point Problems

Khalafi, Mohammad, Boob, Digvijay

arXiv.org Artificial Intelligence

We investigate a primal-dual (PD) method for the saddle point problem (SPP) that uses a linear approximation of the primal function instead of the standard proximal step, resulting in a linearized PD (LPD) method. For convex-strongly concave SPP, we observe that the LPD method has a suboptimal dependence on the Lipschitz constant of the primal function. To fix this issue, we combine features of Accelerated Gradient Descent with the LPD method resulting in a single-loop Accelerated Linearized Primal-Dual (ALPD) method. ALPD method achieves the optimal gradient complexity when the SPP has a semi-linear coupling function. We also present an inexact ALPD method for SPPs with a general nonlinear coupling function that maintains the optimal gradient evaluations of the primal parts and significantly improves the gradient evaluations of the coupling term compared to the ALPD method. We verify our findings with numerical experiments.